# ChatML Dialogue Optimization

DPOpenHermes 7B v2
Apache-2.0
DPOpenHermes 7B v2 is the second RL fine-tune of OpenHermes-2.5-Mistral-7B, trained with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs and allenai/ultrafeedback_binarized_cleaned preference datasets.
Large Language Model · Transformers · English
openaccess-ai-collective
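
Because the model is trained on the ChatML dialogue format, prompts should wrap each turn in `<|im_start|>` / `<|im_end|>` markers. Below is a minimal usage sketch with Hugging Face `transformers`; the repo id `openaccess-ai-collective/DPOpenHermes-7B-v2`, the system prompt, and the generation settings are illustrative assumptions, not taken from this page.

```python
# Minimal sketch: generate one reply from a ChatML-formatted prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openaccess-ai-collective/DPOpenHermes-7B-v2"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# ChatML wraps every turn in <|im_start|>{role} ... <|im_end|> markers.
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Explain Direct Preference Optimization in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(
    output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
))
```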